Abstract — The whole world has been digitized by multiple
satellite remote-sensing systems. An enormous number of remote
sensing images has been collected by satellites, which has led to
an exponential increase in the quantity of remotely sensed images
in databases. These images are used for environmental
monitoring, disaster forecasting, geological survey, and other
applications. Accurate and quick retrieval of satellite images
from huge databases is a challenge. This paper implements a
prototype remote sensing image retrieval system that uses
high-level image features obtained through five different
methods: three based on analysis of color features and two
based on analysis of texture features. Experiments were carried
out on a test image database, and the results are analysed in
terms of retrieval accuracy, classification, confusion matrix,
and a precision and recall plot.

Index Terms —Remote sensing, Color autocorrelogram, 
Color moments, DWT, Gabor filter, HSV histogram. 

I. INTRODUCTION 

Content-based image retrieval (CBIR) measures the visual
similarity between a query image and database images. The
retrieval result is a set of images similar to the query image.
Once the feature vectors of the query image and the database
images have been extracted, similarity measures are needed to
rank the database images by the distance between their vectors
and the query image vector. CBIR systems assume that the best
images to retrieve are those most visually similar to a given
query image within a large collection of images. Such a system
indexes visual characteristics of an image, such as its color,
texture and shape, in order to look for a specific image in a
large collection of images.

Advances in satellite systems have resulted in diverse
remotely sensed data, the volume of which is increasing rapidly.
Remote Sensing Image Retrieval is an application of CBIR
techniques to remote sensing archives. Satellite images
represent large amounts of complex geographical data. For
spatial information retrieval, considering only low-level image
features is not sufficient.

This paper proposes a methodology to retrieve remote sensing
(RS) images by applying high-level features. An RS image is first
segmented, and over-segmented regions are merged to locate
regions of interest. Region features are then extracted in the
form of color and texture.

The remainder of this paper is organized as follows. Section 2
briefly discusses research related to content-based image
retrieval in remote sensing images. Section 3 introduces our
proposed method. Implementation and results are shown in
Section 4. Section 5 concludes the paper.

II. LITERATURE REVIEW 

Content-based image retrieval has been studied
comprehensively in the past few years. The survey
includes 100+ papers covering the research aspects of many
authors. Yong Rui, Thomas S. Huang and Shih-Fu Chang have
reviewed image feature representation and extraction,
multi-dimensional indexing, and system design, three of the
fundamental bases of content-based image retrieval [1]. K.
Vijay Kumar has provided a comprehensive survey of
recent technical achievements in high-level semantic-based
image retrieval in [2].

Tingting Liu, Liangpei Zhang, Pingxiang Li and Hui Lin 
have implemented the approach of semantic mining [3]. 
Homogeneous spectral and textural characteristics are 
extracted from image regions, and then a uniform 
region-based representation for each image is built. In [13],
the authors designed an object-based confusion matrix
(OCM) classification accuracy assessment scheme to
accurately estimate the overall and individual category
classification accuracy. Neera Lai, Neetesh Gupta and Amit
Sinhal [11] have compared image classification techniques
in CBIR and also introduced classifiers such as the support
vector machine and the Bayesian classifier for accurate
and efficient retrieval of images. Many authors have worked
on high-level color and texture feature extraction [5-10].

III. METHODOLOGY

Remote sensing image retrieval systems provide mechanisms
for choosing, from all the accessible data in a database, the
items that most resemble a query image. The proposed system
mainly comprises six stages. Fig. 1 shows the flow of the
retrieval process.

Fig 1. Methodology for image retrieval


A. Image Segmentation 

Image segmentation is considered a cornerstone of any
image retrieval system. Segmentation refers to the process of
partitioning a digital image into multiple segments. Satellite
images are rich in color and texture, which makes it difficult
to identify image regions containing color-texture patterns.
JSEG [4] is a region-based segmentation algorithm for color
images. JSEG stands for J-value segmentation. It is based on
the concept of region growing and is a robust method for
segmenting natural images. The JSEG algorithm simplifies the
color and quality of images.

The colors in the image are first quantized, and the
quantized colors are then assigned labels. A color class is the
set of image pixels quantized to the same color. The image pixel
colors are replaced by their corresponding color class labels.
The newly constructed image of labels is called a class-map.
J-images, which are grey-scale images, are then formed by
applying windows of different sizes to the class-map. The next
step is spatial segmentation, where a region growing method is
applied to the multi-scale J-images. The last step is region
merging, which avoids over-segmentation and gives the final
segmented image.
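
The J value behind a J-image can be illustrated with a short sketch. The text does not give the formula, so the code below assumes the definition commonly used in the JSEG literature, J = (S_T - S_W) / S_W, where S_T is the total scatter of pixel positions in a window and S_W is the scatter of positions around the mean of their own color class. The function name and the toy class-maps are hypothetical.

```python
def j_value(class_map):
    """J value of one window of a class-map (2-D list of color class labels).

    Assumed definition (from the JSEG literature, not from this text):
    J = (S_T - S_W) / S_W, computed on pixel positions.
    """
    # Group pixel positions by their color class label.
    points = {}
    for y, row in enumerate(class_map):
        for x, label in enumerate(row):
            points.setdefault(label, []).append((x, y))

    def scatter(pts):
        # Sum of squared distances of the points from their own mean position.
        n = len(pts)
        mean_x = sum(p[0] for p in pts) / n
        mean_y = sum(p[1] for p in pts) / n
        return sum((px - mean_x) ** 2 + (py - mean_y) ** 2 for px, py in pts)

    all_points = [p for pts in points.values() for p in pts]
    s_total = scatter(all_points)
    s_within = sum(scatter(pts) for pts in points.values())
    if s_within == 0:
        return float("inf") if s_total > 0 else 0.0
    return (s_total - s_within) / s_within


# A window split into two uniform halves gives a high J value, while a
# window where the classes are evenly mixed gives a J value near zero.
split = [[0, 0, 1, 1] for _ in range(4)]
mixed = [[(x + y) % 2 for x in range(4)] for y in range(4)]
print(j_value(split), j_value(mixed))
```

High J values therefore mark windows that straddle a region boundary, which is why region growing can start from the low-valued areas of a J-image.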

B. Region Information Description 

Natural scene description means describing each region,
or area, by regional shape descriptors. Regional
information in RS images is described using color and texture
features. Color features are extracted using the color
autocorrelogram, color moments and HSV histogram,
whereas texture features are extracted using Gabor filters and
the discrete wavelet transform. These features are extracted
separately for each region in all images.

1) Color Autocorrelogram 

The most common second-order statistical measures used in
image retrieval are based on the correlation function between
the image pixels. The image correlogram describes the
correlation of the image colors as a function of their spatial
distance. The autocorrelogram is a subset of the correlogram,
and it gives the probability of finding identical colors at a
certain distance [6].

The correlogram is defined as follows. Let [D] denote a set of
D fixed distances $\{d_1, \ldots, d_D\}$. Then the correlogram of
the image $I$ is defined for the level pair $(g_i, g_j)$ at a
distance $d$ as

$$\gamma^{(d)}_{g_i, g_j}(I) = \Pr_{p_1 \in I_{g_i},\, p_2 \in I}\left[\, p_2 \in I_{g_j} \;\middle|\; |p_1 - p_2| = d \,\right] \tag{1}$$

which gives the probability that, given any pixel $p_1$ of level
$g_i$, a pixel $p_2$ at a distance $d$ in a certain direction from
$p_1$ is of level $g_j$. Here $I_g$ denotes the set of pixels of
$I$ that have level $g$. The autocorrelogram captures the spatial
correlation of identical levels only:

$$\alpha^{(d)}_{g_i}(I) = \gamma^{(d)}_{g_i, g_i}(I) \tag{2}$$

It gives the probability that pixels $p_1$ and $p_2$, a distance
$d$ away from each other, are of the same level $g_i$. In this
work, we considered the color autocorrelogram with distances
d = [1, 3, 5, 7] [5].
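
As an illustration of the autocorrelogram, the following minimal sketch works on an image that has already been quantized to integer color labels. It is not the implementation used in this work: the function name is hypothetical, and the neighborhood at distance d is assumed to be the four axis-aligned offsets only, which is one common simplification.

```python
def autocorrelogram(labels, n_colors, distances=(1, 3, 5, 7)):
    """Color autocorrelogram of a 2-D list of quantized color labels.

    For each distance d and each color c, estimates the probability that a
    pixel at distance d from a pixel of color c also has color c.
    Assumption: only the 4 axis-aligned neighbors at distance d are checked.
    """
    height, width = len(labels), len(labels[0])
    features = []
    for d in distances:
        offsets = [(-d, 0), (d, 0), (0, -d), (0, d)]
        same = [0] * n_colors    # neighbor pairs with identical color
        total = [0] * n_colors   # all neighbor pairs that fall inside the image
        for y in range(height):
            for x in range(width):
                c = labels[y][x]
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        total[c] += 1
                        same[c] += labels[ny][nx] == c
        features.extend(s / t if t else 0.0 for s, t in zip(same, total))
    return features


# Usage: a checkerboard never repeats a color at distance 1 but always
# repeats it at distance 2.
checker = [[(x + y) % 2 for x in range(8)] for y in range(8)]
print(autocorrelogram(checker, n_colors=2, distances=(1, 2)))
```

The resulting vector has one entry per (distance, color) pair, so its length is the number of distances times the number of quantized colors.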

2) Color Moments

Color moments are measures that characterize the color
distribution in an image. Color moments are scaling and
rotation invariant. We computed the first two color moments,
mean and standard deviation, for each of the R, G, and B
channels.

Mean

The first color moment can be interpreted as the average color
in the image, and it can be calculated by using the following
formula,

$$E_i = \frac{1}{N}\sum_{j=1}^{N} p_{ij} \tag{3}$$

where $N$ is the number of pixels in the image and $p_{ij}$ is the
value of the j-th pixel of the image at the i-th color channel.

Standard Deviation

The second color moment is the standard deviation, which is
obtained by taking the square root of the variance of the color
distribution.

$$\sigma_i = \sqrt{\frac{1}{N}\sum_{j=1}^{N}\left(p_{ij} - E_i\right)^2} \tag{4}$$

where $E_i$ is the mean value, or first color moment, for the i-th
color channel of the image.
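
The two moments can be computed directly from their definitions. The sketch below is a minimal illustration that assumes the region is given as a flat list of (R, G, B) tuples; the function name and the ordering of the output vector are hypothetical choices.

```python
import math


def color_moments(pixels):
    """First two color moments (mean, standard deviation) per RGB channel.

    `pixels` is a non-empty list of (r, g, b) tuples. Returns a 6-element
    list [E_R, sigma_R, E_G, sigma_G, E_B, sigma_B].
    """
    n = len(pixels)
    features = []
    for channel in range(3):
        values = [p[channel] for p in pixels]
        mean = sum(values) / n
        variance = sum((v - mean) ** 2 for v in values) / n
        features.extend([mean, math.sqrt(variance)])
    return features


# Usage on a two-pixel region.
print(color_moments([(0, 0, 0), (2, 4, 6)]))
```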

3) HSV Histogram 

A color histogram gives the proportion of pixels of each
color, regardless of the spatial location of the colors. This
step performs three actions: color space conversion, color
quantization and histogram computation.

First, the RGB color space is converted into the HSV color
space. In the HSV color space, hue is used to distinguish colors,
saturation is the percentage of white light added to a pure
color, and value refers to the perceived light intensity. In
color quantization, the colors of every image in the database
are quantized in the HSV model to make later computations
easier. Color quantization reduces the number of distinct colors
used in an image. Using the direct values of the H, S and V
components to represent the color feature would require a lot of
computation, so it is better to quantize the HSV color space
into non-equal intervals. At the same time, because the power of
the human eye to distinguish colors is limited, we do not need
to calculate all segments. Unequal-interval quantization
according to human color perception has been applied to the H,
S, and V components [8]: H is quantized into 8 bins, S into 2
bins and V into 2 bins. Finally, we concatenate the 8 × 2 × 2
histogram and get a 32-dimensional vector.
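
A minimal sketch of the three actions is given below. The boundaries of the unequal intervals are not listed in the text, so this example uses uniform bins as a stand-in assumption; only the bin counts (8, 2, 2) come from the description above. The function name is hypothetical.

```python
import colorsys


def hsv_histogram(pixels, h_bins=8, s_bins=2, v_bins=2):
    """Normalized HSV histogram of a list of 8-bit (r, g, b) tuples.

    Assumption: uniform bin edges are used here because the unequal
    interval boundaries are not specified in the text.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    hist = [0] * (h_bins * s_bins * v_bins)
    for r, g, b in pixels:
        # Color space conversion: colorsys expects channel values in [0, 1].
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Color quantization: map each component to its bin index.
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1
    # Histogram computation: normalize counts to proportions.
    return [count / len(pixels) for count in hist]


# Usage: one pure red pixel and one black pixel.
features = hsv_histogram([(255, 0, 0), (0, 0, 0)])
print(len(features))
```

Replacing the three `int(...)` lines with lookups into perceptually chosen bin edges would give the unequal-interval variant.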


4) Gabor Filters 

The extraction of texture of an image is accomplished by 
using a set of Gabor Filters. Frequency and orientation 
representations of Gabor filters are similar to those of the 
human visual system, and they have been found to be 
particularly appropriate for texture representation and 
discrimination. Gabor filters are a group of wavelets 
capturing energy at a specific frequency and a specific 
direction. The expansion of a signal using this basis provides 
a localized frequency description, therefore, capturing local 
features/energy of the signal. Texture features can thus be 
extracted from this group of energy distributions [7]. 

For a given image $I(x, y)$ of size $P \times Q$, its discrete
Gabor wavelet transform is given by a convolution:

$$G_{mn}(x, y) = \sum_{s}\sum_{t} I(x - s,\, y - t)\,\psi^{*}_{mn}(s, t) \tag{5}$$

where $s$ and $t$ are the filter mask size variables, and
$\psi^{*}_{mn}$ is the complex conjugate of $\psi_{mn}$, which is a
class of self-similar functions generated from dilation and
rotation of the following mother wavelet:

$$\psi(x, y) = \frac{1}{2\pi\sigma_x\sigma_y}\exp\left[-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right)\right]\exp(j 2\pi W x) \tag{6}$$

where $W$ is called the modulation frequency. The self-similar
Gabor wavelets are obtained through the generating function:

$$\psi_{mn}(x, y) = a^{-m}\,\psi(\tilde{x}, \tilde{y}) \tag{7}$$

where $m$ and $n$ specify the scale and orientation of the
wavelet respectively, with $m = 0, 1, \ldots, M-1$ and
$n = 0, 1, \ldots, N-1$, and $(\tilde{x}, \tilde{y})$ are the
rotated and scaled coordinates
$\tilde{x} = a^{-m}(x\cos\theta + y\sin\theta)$,
$\tilde{y} = a^{-m}(-x\sin\theta + y\cos\theta)$, with $a > 1$ and
$\theta = n\pi/N$.

After applying Gabor filters on the image with different
orientations at different scales, we obtain an array of
magnitudes:

$$E(m, n) = \sum_{x}\sum_{y} \left|G_{mn}(x, y)\right| \tag{8}$$

These magnitudes represent the energy content at different
scales and orientations of the image. The main purpose of
texture-based retrieval is to find images or regions with
similar texture. It is assumed that we are interested in images
or regions that have homogeneous texture; therefore the
following mean $\mu_{mn}$ and standard deviation $\sigma_{mn}$ of
the magnitude of the transformed coefficients are used to
represent the homogeneous texture feature of the region:

$$\mu_{mn} = \frac{E(m, n)}{P \times Q} \tag{9}$$

$$\sigma_{mn} = \frac{\sqrt{\sum_{x}\sum_{y}\left(\left|G_{mn}(x, y)\right| - \mu_{mn}\right)^2}}{P \times Q} \tag{10}$$

A feature vector is created using $\mu_{mn}$ and $\sigma_{mn}$ as
the feature components. Four scales and six orientations are
used in a common implementation, and the feature vector of
length 48 is given by:

$$f = \left(\mu_{00}, \sigma_{00}, \mu_{01}, \sigma_{01}, \ldots, \mu_{35}, \sigma_{35}\right) \tag{11}$$
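
The Gabor feature extraction can be sketched in plain Python for a small grey-level region. This is only an illustration under several assumptions that are not stated in the text: the parameter values (a, W, sigma_x, sigma_y and the mask half-width) are arbitrary, pixels outside the image are treated as zero, and the standard deviation is normalized by P × Q as read from Eq. (10). All function names are hypothetical.

```python
import cmath
import math


def gabor_kernel(m, n, n_orient, a=2.0, W=0.4, sigma_x=2.0, sigma_y=2.0, half=3):
    """Sample the Gabor wavelet for scale m and orientation n on a square mask.

    All default parameter values are illustrative assumptions.
    """
    theta = n * math.pi / n_orient
    scale = a ** (-m)
    kernel = {}
    for s in range(-half, half + 1):
        for t in range(-half, half + 1):
            # Rotated and scaled coordinates.
            xr = scale * (s * math.cos(theta) + t * math.sin(theta))
            yr = scale * (-s * math.sin(theta) + t * math.cos(theta))
            gauss = math.exp(-0.5 * (xr ** 2 / sigma_x ** 2 + yr ** 2 / sigma_y ** 2))
            gauss /= 2 * math.pi * sigma_x * sigma_y
            kernel[(s, t)] = scale * gauss * cmath.exp(2j * math.pi * W * xr)
    return kernel


def gabor_features(image, n_scales=4, n_orient=6):
    """Return [mu_00, sigma_00, ..., mu_35, sigma_35] for a 2-D list `image`."""
    rows, cols = len(image), len(image[0])
    count = rows * cols
    features = []
    for m in range(n_scales):
        for n in range(n_orient):
            kernel = gabor_kernel(m, n, n_orient)
            magnitudes = []
            for x in range(rows):
                for y in range(cols):
                    acc = 0j
                    # Convolution with the conjugate wavelet; pixels outside
                    # the image are treated as zero (an assumption).
                    for (s, t), w in kernel.items():
                        xs, yt = x - s, y - t
                        if 0 <= xs < rows and 0 <= yt < cols:
                            acc += image[xs][yt] * w.conjugate()
                    magnitudes.append(abs(acc))
            mu = sum(magnitudes) / count
            sigma = math.sqrt(sum((g - mu) ** 2 for g in magnitudes)) / count
            features.extend([mu, sigma])
    return features


# Usage on a small striped region.
stripes = [[float(y % 2) for y in range(8)] for _ in range(8)]
print(len(gabor_features(stripes)))
```

A practical implementation would use an array library and FFT-based convolution; the nested loops here are kept only to mirror the equations term by term.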


5) Discrete Wavelet Transform 

DWT can be performed by iteratively filtering a signal or
image through low-pass and high-pass filters, and
subsequently downsampling the filtered data by two [10].
This process decomposes the input image into a series of
sub-band images.

The DWT decomposition of the image is applied up to the third
level. Wavelet statistical features are calculated for each
level and for each sub-band (High-High, High-Low, Low-High,
Low-Low). The first two moments [9] of the wavelet
coefficients, i.e. the mean and standard deviation, are
considered.

Mean: The mean is a measurement of the average intensity level
in that sub-band.

$$\text{Mean} = \frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N} C(i, j) \tag{12}$$

where $C(i, j)$ is the transformed value at $(i, j)$ for any
sub-band of size $N \times N$.

Standard Deviation: The standard deviation of the image
gives a measure of the amount of detail in that sub-band.

$$\text{SD} = \sqrt{\frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N}\left[C(i, j) - m\right]^2} \tag{13}$$

where $C(i, j)$ is the transformed value at $(i, j)$ for any
sub-band of size $N \times N$ and $m$ is the mean.
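
The wavelet statistical features can be sketched as follows. The wavelet family is not named in the text, so this example assumes the Haar wavelet because it is the simplest to write out; the function names are hypothetical, and the image side is assumed to be divisible by 2 to the power of the number of levels.

```python
import math


def haar_dwt2(image):
    """One level of a 2-D orthonormal Haar DWT on an even-sized 2-D list.

    Returns (LL, LH, HL, HH). Here LH is the horizontal average combined
    with the vertical difference, HL the reverse, HH the diagonal detail.
    """
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, len(image), 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for j in range(0, len(image[0]), 2):
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            row_ll.append((a + b + c + d) / 2)
            row_lh.append((a + b - c - d) / 2)
            row_hl.append((a - b + c - d) / 2)
            row_hh.append((a - b - c + d) / 2)
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh


def mean_and_sd(band):
    """Mean and standard deviation of the coefficients of one sub-band."""
    values = [c for row in band for c in row]
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((c - mean) ** 2 for c in values) / len(values))
    return mean, sd


def wavelet_features(image, levels=3):
    """Mean and SD of LL, LH, HL, HH at each level: 8 values per level."""
    features = []
    current = image
    for _ in range(levels):
        bands = haar_dwt2(current)
        for band in bands:
            features.extend(mean_and_sd(band))
        current = bands[0]  # decompose the low-low band again
    return features


# Usage on an 8 x 8 region: three levels give 24 features.
region = [[float((x * y) % 5) for x in range(8)] for y in range(8)]
print(len(wavelet_features(region)))
```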

C. Retrieval using Similarity Metrics 

To perform the actual image retrieval, we investigated a
number of vector distance measures to discover which gave
the most accurate and perceptually correct results. The
similarity metrics used are L1, L2, Standardized L2, Cityblock,
Minkowski, Chebyshev, Cosine, Correlation, Spearman,
Normalized L2 and Relative deviation.
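
Retrieval by distance ranking can be sketched with a few of these measures. The example below shows only L1, L2, Chebyshev and cosine distance; the remaining metrics follow the same pattern. The function names, the toy feature vectors and the dictionary-based database are hypothetical.

```python
import math


def l1(u, v):
    """L1 (city block) distance."""
    return sum(abs(a - b) for a, b in zip(u, v))


def l2(u, v):
    """L2 (Euclidean) distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def chebyshev(u, v):
    """Chebyshev distance: the largest per-component difference."""
    return max(abs(a - b) for a, b in zip(u, v))


def cosine_distance(u, v):
    """One minus the cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def rank(query, database, distance=l2):
    """Return database keys ordered from most to least similar to `query`."""
    return sorted(database, key=lambda name: distance(query, database[name]))


# Usage with toy two-dimensional feature vectors.
database = {"a": [0.0, 0.0], "b": [3.0, 4.0], "c": [1.0, 1.0]}
print(rank([0.0, 0.0], database))
```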

D. Classification using SVM 

Support vector machine (SVM) is a supervised learning
technique that analyzes data and identifies patterns used for
classification. It performs classification by generating a
hyperplane. The optimal hyperplane separates the data into two
categories. The aim of the support vector machine is to find
the optimal hyperplane that distinguishes groups of vectors in
such a way that one category of the variables is on one side of
the hyperplane and the other category is on the other side.
Vectors nearest to the hyperplane are called support
vectors [11]. The optimum hyperplane can be defined as the
linear classifier with the maximum margin for a given set of
variables. SVM is one of the best known methods in pattern
classification and image classification.
It is designed to separate a set of training images into two
different classes, (x1, y1), (x2, y2), ..., (xn, yn), where xi in
R^d, the d-dimensional feature space, and yi in {-1, +1}, the
class label, with i = 1..n [12]. SVM builds the optimal
separating hyperplane.

[Figure text: SVM (1-against-1); Accuracy = 91.60%; Confusion Matrix]

E. Confusion Matrix

A confusion matrix, also known as an error matrix,
visualizes the performance of an algorithm, typically a
supervised learning one [13]. Each column of the matrix
represents the instances in a predicted class, while each row
represents the instances in an actual class. The name stems
from the fact that it makes it easy to see if the system is
confusing two classes (i.e. commonly mislabeling one as
another). It is a table with n rows and n columns, where n is
the number of classes, that reports the number of false
positives, false negatives, true positives, and true negatives.
This allows more detailed analysis than the mere proportion of
correct guesses (accuracy).

[Figure text: Predicted Query Image Belongs to Class = 1]


Fig 3. Classification Results 
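
A confusion matrix and the per-class counts read from it can be sketched as follows. Rows are actual classes and columns are predicted classes, as described above; the function names and the toy labels are hypothetical and do not reproduce the experiment reported here.

```python
def confusion_matrix(actual, predicted, n_classes):
    """n x n matrix: rows are actual classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


def per_class_counts(matrix, k):
    """True positives, false positives, false negatives, true negatives for class k."""
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                  # class k labeled as another class
    fp = sum(row[k] for row in matrix) - tp   # another class labeled as k
    tn = sum(map(sum, matrix)) - tp - fn - fp
    return tp, fp, fn, tn


def accuracy(matrix):
    """Proportion of correct guesses: the diagonal over the total count."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / sum(map(sum, matrix))


# Usage with three classes and six toy samples.
actual = [0, 0, 1, 1, 2, 2]
predicted = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(actual, predicted, n_classes=3)
print(cm, per_class_counts(cm, 0), accuracy(cm))
```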



F. Precision & Recall 


In information retrieval, precision is the fraction of retrieved 
instances that are relevant, while recall is the fraction of 
relevant instances that are retrieved. Both precision and recall 
are therefore based on an understanding and measure of 
relevance. 


$$\text{Precision} = \frac{\text{Number of relevant images retrieved}}{\text{Total number of images retrieved}}$$

$$\text{Recall} = \frac{\text{Number of relevant images retrieved}}{\text{Total number of relevant images}}$$
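
The two measures, and the points of a precision and recall plot, can be computed from a ranked retrieval list as in the sketch below. The function names and the toy image identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of one retrieved set against the relevant set."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def precision_recall_curve(ranked, relevant):
    """(precision, recall) after each of the top-k results, k = 1..len(ranked)."""
    return [precision_recall(ranked[:k], relevant) for k in range(1, len(ranked) + 1)]


# Usage: four retrieved images, three relevant images in the database.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}
print(precision_recall(ranked, relevant))
print(precision_recall_curve(ranked, relevant))
```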


IV. IMPLEMENTATION AND RESULTS

The above algorithm is tested on a database of 400 images.
The images in the database were collected from different open
sources available on the internet. All these images are
classified into 4 different classes, namely Land & Water, Beach,
Mountain, and Dainos.

Fig 2 below shows retrieval results for the Land & Water class.
Fig 3 shows the classification result for the query image in
Fig 2. Fig 4 is the confusion matrix for the query image. Fig 5
is the Precision & Recall plot of the proposed system.

Table I summarizes the retrieval accuracy of the different
classes using various distance measures.




Fig.2 Image retrieval results 


Table I. Retrieval accuracy

Distance measure      Land & Water   Beach     Mountain   Dainos
L1                    89.40%         92.00%    92.00%     92.00%
L2                    91.80%         91.00%    90.60%     91.80%
Std.L2                91.60%         89.20%    92.20%     91.40%
Cityblock             90.40%         93.00%    90.80%     91.40%
Minkowski             91.60%         91.40%    91.00%     91.80%
Chebyshev             90.80%         91.40%    90.60%     92.40%
Cosine                88.80%         91.20%    90.60%     90.60%
Correlation           90.20%         91.00%    89.80%     91.80%
Spearman              92.00%         89.80%    91.60%     91.40%
Norm.L2               90.60%         92.00%    91.80%     92.40%
Relative Deviation    91.80%         90.60%    91.80%     93.00%

V. CONCLUSION

In this paper, remote sensing image retrieval is proposed
and implemented for accurate image search, according to user
interest, in a digital image database. Images are first
segmented using the JSEG color segmentation method, and
high-level color and texture features are extracted from each
region. During the retrieval phase, the feature vector of the
query image is compared with those of the database images using
distance metrics. All the distance metrics give more than 90%
accuracy on average. Retrieval accuracy for different classes
using different similarity metrics is compared. The
precision-recall plot for the proposed system shows good
retrieval accuracy once the system has been trained after a few
initial retrievals. The proposed approach is practical and shows
good retrieval results on the test database.